PARALLEL EIGENVALUE EXTRACTION 
Generalized Eigenproblem

$[K][\Phi] = [M][\Phi][\omega^2]$

N: degrees of freedom
Required: n eigenpairs, n < N
[K] positive-definite
New parallel algorithm for the solution of large scale eigenproblems
in finite element applications. 


• Work is in progress to implement the algorithm on the NAS Cray-2 computer
at Ames.

• Assumptions 

1 - Linear elastic finite element models 

2 - The n lowest-order eigenpairs are required, i.e. $\omega_1^2 \le \dots \le \omega_n^2$

3 - [K] is positive-definite

4 - [M] is positive semi-definite
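
To make the problem statement and assumptions 3 and 4 concrete, here is a minimal sketch that reduces the generalized eigenproblem to a standard symmetric one through a Cholesky factor of [K]. This works only because [K] is positive-definite, and it tolerates a singular [M]. The function name `lowest_eigenpairs`, the spring chain, and the massless degree of freedom are illustrative assumptions and are not taken from the original work.

```python
import numpy as np

def lowest_eigenpairs(K, M, n):
    """Return the n lowest (omega^2, Phi) of K Phi = M Phi omega^2."""
    L = np.linalg.cholesky(K)            # raises LinAlgError unless K is positive-definite
    X = np.linalg.solve(L, M)            # L^-1 M
    A = np.linalg.solve(L, X.T)          # L^-1 M L^-T, symmetric
    A = 0.5 * (A + A.T)                  # remove round-off asymmetry
    mu, Y = np.linalg.eigh(A)            # mu = 1/omega^2, ascending
    mu, Y = mu[::-1][:n], Y[:, ::-1][:, :n]   # largest mu = lowest omega^2
    Phi = np.linalg.solve(L.T, Y)        # back-transform the eigenvectors
    return 1.0 / mu, Phi

# Hypothetical test model: chain of unit springs, fixed at one end.
N = 6
K = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
K[-1, -1] = 1.0
M = np.eye(N)
M[2, 2] = 0.0                            # one massless DOF: M is only semi-definite

omega2, Phi = lowest_eigenpairs(K, M, n=3)
residual = np.linalg.norm(K @ Phi - (M @ Phi) * omega2)
print(omega2, residual)
```

A dense factorization like this is only practical for small N; it is meant to show what is being computed, not how the parallel algorithm computes it.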
Finite Element Model Subdivided into m Domains
• Consider a parallel computer with (m+1) processors (tasks).

• Designate the first processor as a global processor (task).

• Designate the remaining m processors as domain processors (tasks).

• A finite element model can be divided into a number of domains equal 
to m. 

• A star architecture (or tree) is the first to be investigated. 
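
The subdivision described above can be sketched as follows. This is a naive illustration that assumes elements are simply split into m contiguous groups; nodes shared by more than one domain are then treated as lying on the global front. The function `partition` and the one-dimensional element list are hypothetical.

```python
def partition(elements, m):
    """Split elements into m contiguous domains; classify nodes as interior or front."""
    size = -(-len(elements) // m)                      # ceiling division
    domains = [elements[i * size:(i + 1) * size] for i in range(m)]
    owners = {}                                        # node -> set of domains using it
    for d, elems in enumerate(domains):
        for elem in elems:
            for node in elem:
                owners.setdefault(node, set()).add(d)
    front = sorted(n for n, ds in owners.items() if len(ds) > 1)
    interior = [sorted(n for n, ds in owners.items() if ds == {d})
                for d in range(m)]
    return domains, interior, front

# Hypothetical model: 8 two-node elements in a row, m = 4 domain processors.
elements = [(i, i + 1) for i in range(8)]
domains, interior, front = partition(elements, 4)
print(front)      # nodes handled by the global processor
print(interior)   # nodes handled inside each domain processor
```

A production partitioner would balance work and minimize the number of front nodes; contiguous slicing is used here only to keep the sketch short.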





$[K][\Phi] = [M][\Phi][\omega^2]$


1 - Creation of $K_e$ & $M_e$

2 - Eigensolution (Modified Subspace)

3 - Equation Solver (Frontal Solution)
• Three major steps with large computational requirements:

1 - Creation of element stiffness and mass matrices.

2 - Extraction of a set of eigenpairs.

3 - Solution of a set of simultaneous linear equations.


• The merits of selecting the modified subspace method for step #2 and
the frontal solution for step #3 above are discussed in the next
viewgraphs.
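
Step 1 can be illustrated with the simplest possible element. The sketch below assumes a 2-node axial bar element with a consistent mass matrix; the element type, the material values, and the helper names `bar_element` and `assemble` are assumptions chosen for illustration, since the original does not specify an element library.

```python
import numpy as np

def bar_element(E, A, rho, length):
    """Stiffness and consistent mass matrices of a 2-node axial bar element."""
    Ke = (E * A / length) * np.array([[1.0, -1.0], [-1.0, 1.0]])
    Me = (rho * A * length / 6.0) * np.array([[2.0, 1.0], [1.0, 2.0]])
    return Ke, Me

def assemble(n_nodes, elements):
    """Add each element's (nodes, Ke, Me) contribution into global K and M."""
    K = np.zeros((n_nodes, n_nodes))
    M = np.zeros((n_nodes, n_nodes))
    for nodes, Ke, Me in elements:
        K[np.ix_(nodes, nodes)] += Ke
        M[np.ix_(nodes, nodes)] += Me
    return K, M

# Hypothetical bar of unit length, split into 4 equal elements.
n_el = 4
Ke, Me = bar_element(E=1.0, A=1.0, rho=1.0, length=0.25)
elements = [([i, i + 1], Ke, Me) for i in range(n_el)]
K, M = assemble(n_el + 1, elements)
print(M.sum())    # total mass of the bar
```

Each element's matrices depend only on that element's data, which is why this step can be carried out independently inside every domain.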
Modified Subspace Method


$[V]^*_{\ell+1} = [K]^{-1}[M][V]_\ell = [K]^{-1}[B]_\ell$

where $\ell = 1, 2, 3, \dots$
The Modified Subspace method iterates simultaneously for a subset of
eigenpairs $[\Phi, \omega]$ of the generalized eigenproblem:

1 - Let $[V]_1$ be n starting eigenvectors. Experience has shown that random
numbers can be used here. A number of techniques are available in the
literature for selecting $[V]_1$.

2 - Operate on each [V] as follows:

$[V]^*_{\ell+1} = [K]^{-1}[M][V]_\ell = [K]^{-1}[B]_\ell$

where $\ell = 1, 2, 3, \dots$

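3 - Modify [V] to increase the convergence rate by one third on average:

$[\bar V]_{\ell+1} = [V]^*_{\ell+1} - \beta_\ell [V]_\ell$

where: $\beta_\ell = 0$ for $\ell = 1$ and $\ell > 11$; otherwise

$\beta_\ell = 0.5\,(1 + r_{\ell-1}) / \omega_n^2$

$r_{\ell-1}$ are the interval points of the 11th-order Lobatto
rule on [-1, 1].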


Roots of the 11th Order Lobatto Rule (Kopal 1961)

  r_1    -0.9533098466        r_6     0.0000000000
  r_2    -0.8463475646        r_7    +0.2492869301
  r_3    -0.6861884690        r_8    +0.4829098210
  r_4    -0.4829098210        r_9    +0.6861884690
  r_5    -0.2492869301        r_10   +0.8463475646
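
The tabulated roots are symmetric about zero and can be cross-checked numerically. The sketch below rests on the assumption that they are the interior Lobatto nodes given by the zeros of the derivative of the Legendre polynomial P_12; that identification is inferred from the numerical values and is not stated in the source.

```python
import numpy as np
from numpy.polynomial import legendre

# Legendre-series coefficients of P_12 (assumed degree, see the note above).
coeffs = np.zeros(13)
coeffs[12] = 1.0

# Interior Lobatto nodes are the zeros of P_12'; np.real drops zero imaginary parts.
roots = np.sort(np.real(legendre.legroots(legendre.legder(coeffs))))

# Values r_1 .. r_10 as tabulated in the text.
tabulated = np.array([-0.9533098466, -0.8463475646, -0.6861884690,
                      -0.4829098210, -0.2492869301, 0.0,
                      0.2492869301, 0.4829098210, 0.6861884690,
                      0.8463475646])

max_diff = np.max(np.abs(roots[:10] - tabulated))
print(len(roots), max_diff)
```

Under this assumption the computation yields eleven zeros; the table lists the first ten, which are the ones the shift formula needs for iterations 2 through 11.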
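Subspace


$[K]^*_{\ell+1} = [\bar V]^T_{\ell+1} [K] [\bar V]_{\ell+1}$

$[M]^*_{\ell+1} = [\bar V]^T_{\ell+1} [M] [\bar V]_{\ell+1}$

The Auxiliary Eigenproblem

$[K]^*_{\ell+1} [Q]_{\ell+1} = [M]^*_{\ell+1} [Q]_{\ell+1} [\Omega^2]_{\ell+1}$

Improved Eigenvectors

$[V]_{\ell+1} = [\bar V]_{\ell+1} [Q]_{\ell+1}$

where $[\bar V]_{\ell+1}$ denotes the modified vectors from step 3.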
4 - Project K and M onto the required subspace.

5 - Solve the auxiliary eigenproblem to obtain $[Q]_{\ell+1}$ and $[\Omega^2]_{\ell+1}$.

6 - An improved set of eigenvectors $[V]_{\ell+1}$ can be obtained.

7 - Test for convergence on $\omega_n^2$. Repeat steps 2 to 6 until the desired accuracy
is achieved.


Note 

1. Step #2 is performed using the frontal solution, concurrently within 
each domain. 

2. Steps 1, 3, 4 and 6 are processed concurrently within each domain. 
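
Steps 1 to 7 can be put together in a serial sketch. Several details are assumptions made for illustration only: the modification in step 3 is taken to be a shift of the form V* - beta V, the value of omega_n^2 in the shift is taken from the previous iteration's auxiliary eigenproblem, and a dense solve stands in for the frontal solution. The names `ritz` and `modified_subspace` are hypothetical.

```python
import numpy as np

# Lobatto points r_1 .. r_10 as tabulated in the text.
R = [-0.9533098466, -0.8463475646, -0.6861884690, -0.4829098210,
     -0.2492869301, 0.0, 0.2492869301, 0.4829098210, 0.6861884690,
     0.8463475646]

def ritz(Kp, Mp):
    """Solve the small auxiliary problem Kp Q = Mp Q Omega^2 (Kp positive-definite)."""
    L = np.linalg.cholesky(Kp)
    A = np.linalg.solve(L, np.linalg.solve(L, Mp).T)     # L^-1 Mp L^-T
    mu, Y = np.linalg.eigh(0.5 * (A + A.T))              # mu = 1/Omega^2, ascending
    mu, Y = mu[::-1], Y[:, ::-1]                         # lowest Omega^2 first
    return 1.0 / mu, np.linalg.solve(L.T, Y)

def modified_subspace(K, M, n, tol=1e-10, max_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    V = rng.standard_normal((K.shape[0], n))             # step 1: random start
    omega2 = None
    for ell in range(1, max_iter + 1):
        V_star = np.linalg.solve(K, M @ V)               # step 2 (dense solve here)
        beta = 0.0
        if 2 <= ell <= 11:                               # step 3: assumed shift form
            beta = 0.5 * (1.0 + R[ell - 2]) / omega2[-1]
        V_bar = V_star - beta * V
        Kp = V_bar.T @ K @ V_bar                         # step 4: projection
        Mp = V_bar.T @ M @ V_bar
        new_omega2, Q = ritz(Kp, Mp)                     # step 5: auxiliary problem
        V = V_bar @ Q                                    # step 6: improved vectors
        if omega2 is not None and \
                abs(new_omega2[-1] - omega2[-1]) <= tol * new_omega2[-1]:
            return new_omega2, V, ell                    # step 7: converged
        omega2 = new_omega2
    return omega2, V, max_iter

# Hypothetical test model: chain of unit springs and unit masses.
N = 12
K = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
K[-1, -1] = 1.0
M = np.eye(N)
omega2, V, iterations = modified_subspace(K, M, n=3)
print(omega2, iterations)
```

In the parallel algorithm the solve in step 2 and the projections in step 4 would be carried out within the domains, with only the small auxiliary problem handled centrally.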





• The modified subspace method converges 33% faster on average than
the classical subspace method.

• Figure shows typical behavior. 

• Most computations are performed on an element by element basis. 
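
As an illustration of element-by-element processing, a product such as [M][V] can be accumulated from element matrices without ever assembling the global matrix. The helper `ebe_product` and the spring-like element matrix below are assumptions made for the example.

```python
import numpy as np

def ebe_product(elements, V):
    """Compute (sum over elements of A_e) @ V without forming the global matrix."""
    out = np.zeros_like(V, dtype=float)
    for dofs, Ae in elements:
        out[dofs] += Ae @ V[dofs]        # gather, multiply, scatter-add
    return out

# Hypothetical data: 5 two-node elements sharing the same 2x2 matrix.
Ae = np.array([[1.0, -1.0], [-1.0, 1.0]])
elements = [([i, i + 1], Ae) for i in range(5)]
V = np.arange(12, dtype=float).reshape(6, 2)

AV = ebe_product(elements, V)
print(AV)
```

Each element touches only its own degrees of freedom, so the loop can be split across domains and the partial results summed along shared nodes.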
Frontal Solution

Gauss elimination technique.

Underlying philosophy is based on processing of elements one by one. 

Simultaneous assembly and elimination of variables. 

The optimum frontal width is at most equal to the optimum band width. 

Numbering of nodes has no impact on optimality while numbering of elements 
is important to minimize the frontal width. 

More efficient for solid elements and elements with mid-side nodes. 

It requires a pre-front to determine last appearance of each node. 

It lends itself to parallel solutions. 
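
The points above can be sketched in a few lines. The following is a simplified frontal solver, assuming a symmetric positive-definite system, a dense front matrix, and no pivoting; `frontal_solve` and the spring-chain data are hypothetical and are not the routines used in the original work.

```python
import numpy as np

def frontal_solve(elements, n_dof):
    """elements: list of (dofs, Ke, fe). Returns the solution and the maximum front width."""
    last = {}                                   # pre-front: last appearance of each dof
    for k, (dofs, _, _) in enumerate(elements):
        for d in dofs:
            last[d] = k
    active = []                                 # dofs currently in the front
    F, g = np.zeros((0, 0)), np.zeros(0)        # front matrix and right-hand side
    eliminated, max_front = [], 0
    for k, (dofs, Ke, fe) in enumerate(elements):
        for d in dofs:                          # bring new dofs into the front
            if d not in active:
                active.append(d)
                F = np.pad(F, ((0, 1), (0, 1)))
                g = np.append(g, 0.0)
        idx = [active.index(d) for d in dofs]
        F[np.ix_(idx, idx)] += Ke               # assemble this element
        g[idx] += fe
        max_front = max(max_front, len(active))
        for d in [d for d in dofs if last[d] == k]:
            p = active.index(d)                 # dof is fully summed: eliminate it
            row, rhs = F[p] / F[p, p], g[p] / F[p, p]
            eliminated.append((d, list(active), row, rhs))
            g = g - F[:, p] * rhs
            F = F - np.outer(F[:, p], row)
            keep = [i for i in range(len(active)) if i != p]
            F, g = F[np.ix_(keep, keep)], g[keep]
            active.pop(p)
    x = np.zeros(n_dof)
    for d, front, row, rhs in reversed(eliminated):      # back-substitution
        x[d] = rhs - sum(r * x[j] for r, j in zip(row, front) if j != d)
    return x, max_front

# Hypothetical chain of unit springs: a grounded spring, then dof i-1 to dof i.
N = 6
spring = np.array([[1.0, -1.0], [-1.0, 1.0]])
elements = [([0], np.array([[1.0]]), np.zeros(1))]
elements += [([i - 1, i], spring, np.zeros(2)) for i in range(1, N)]
elements[-1] = ([N - 2, N - 1], spring, np.array([0.0, 1.0]))   # unit end load
x, max_front = frontal_solve(elements, N)
print(x, max_front)
```

Here the front width is governed entirely by the order in which elements are processed, never by the dof labels, which is the point made above about element versus node numbering.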
Multi-Frontal Solution


Within each domain

For domain i


$[K][V]^*_{\ell+1} = [B]_\ell$

Assembly and elimination gives

$K_{DD} V_D + K_{DF} V_F = B_D$
$K_{FF} V_F = B_F$

where $K_{DD}$ is the upper triangular matrix for domain i

      $V_D$ are the variables within domain i

      $V_F$ are the variables along the global front of domain i

      $B_D$ & $B_F$ are the right-hand sides for domain & global front, respectively


For global fronts

$K_{FF} = \sum_{i=1}^{m} K_{FF}^{(i)}$

$B_F = \sum_{i=1}^{m} B_F^{(i)}$

$K_{FF} V_F = B_F$
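
The multi-frontal idea can be sketched with dense algebra: each domain eliminates its own variables, passes its reduced front matrix and right-hand side to the global task, and recovers its interior solution once the front variables are known. The helpers `assemble` and `condense`, the two-domain spring chain, and the use of dense solves instead of a frontal elimination are all illustrative assumptions.

```python
import numpy as np

def assemble(dofs, springs, ground=None):
    """Assemble unit springs over the local ordering `dofs` (interior first, front last)."""
    pos = {d: i for i, d in enumerate(dofs)}
    K = np.zeros((len(dofs), len(dofs)))
    for a, b in springs:
        i, j = pos[a], pos[b]
        K[np.ix_([i, j], [i, j])] += [[1.0, -1.0], [-1.0, 1.0]]
    if ground is not None:
        K[pos[ground], pos[ground]] += 1.0
    return K

def condense(K, B, n_int):
    """Eliminate the first n_int variables; return front matrix, front rhs, recovery."""
    Kdd, Kdf = K[:n_int, :n_int], K[:n_int, n_int:]
    Kfd, Kff = K[n_int:, :n_int], K[n_int:, n_int:]
    X = np.linalg.solve(Kdd, np.column_stack([Kdf, B[:n_int]]))
    S = Kff - Kfd @ X[:, :-1]                    # domain contribution to K_FF
    b = B[n_int:] - Kfd @ X[:, -1]               # domain contribution to B_F
    recover = lambda Vf: X[:, -1] - X[:, :-1] @ Vf   # V_D once V_F is known
    return S, b, recover

# Hypothetical model: 7-dof spring chain, grounded at dof 0, split at front dof 3.
K1 = assemble([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)], ground=0)
K2 = assemble([4, 5, 6, 3], [(3, 4), (4, 5), (5, 6)])
B1 = np.zeros(4)
B2 = np.array([0.0, 0.0, 1.0, 0.0])              # unit load at dof 6

parts = [condense(K1, B1, 3), condense(K2, B2, 3)]   # domain tasks
K_FF = sum(S for S, _, _ in parts)                   # global task: sum over domains
B_F = sum(b for _, b, _ in parts)
V_F = np.linalg.solve(K_FF, B_F)
V_D1, V_D2 = parts[0][2](V_F), parts[1][2](V_F)      # back in the domains
print(V_D1, V_F, V_D2)
```

Only the small front quantities cross between the domains and the global task, which is why the efficiency of those communication links matters.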
Successful implementation of the new parallel algorithm depends on:

1 - Maximizing the efficiency of communication links between the global
task and the domains

2 - Minimizing sequential computational steps

3 - Multi-threaded I/O


The final report will be available in the summer of 1988.



Anticipated Benefits


Parallel eigenvalue extraction
algorithm to maximize efficiency
and speed-up of computations.

A general purpose eigenproblem
solver for finite element analysis
in parallel computing environments.
op. sys. start: copy data file to GLBFRONT, copy data file to DOMFRONT

GLBFRONT                               DOMFRONT

1. read/check first data card          1. read/check first data card
2. set-up VEC for data input           2. set-up VEC for data input
3. data input and check                3. data input and synthesis
4. reset VEC for global fronts         4. reset VEC for domain
5. pre-front for global fronts         5. pre-front for domain
                                       5. element K and M matrices
                                       6. domain assembly/elimination
                                       7. K_FF to GLBFRONT
6. global fronts solution
7. V_F to DOMFRONT
                                       8. domain solution and subspace
                                       9. K* and M* to GLBFRONT
8. subspace solution
9. Q to DOMFRONT
10. convergence test                   10. improved eigenvectors [V]